{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright (c) MONAI Consortium  \n",
    "Licensed under the Apache License, Version 2.0 (the \"License\");  \n",
    "you may not use this file except in compliance with the License.  \n",
    "You may obtain a copy of the License at  \n",
    "&nbsp;&nbsp;&nbsp;&nbsp;http://www.apache.org/licenses/LICENSE-2.0  \n",
    "Unless required by applicable law or agreed to in writing, software  \n",
    "distributed under the License is distributed on an \"AS IS\" BASIS,  \n",
    "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  \n",
    "See the License for the specific language governing permissions and  \n",
    "limitations under the License.\n",
    "\n",
    "# MONAI Auto3DSeg AutoRunner\n",
    "\n",
    "This notebook will introduce `AutoRunner`, the interface to run the Auto3Dseg pipeline with minimal user inputs.\n",
    "\n",
    "Specifically, it will show the features below:\n",
    "1. Use `AutoRunner` with an input config file `input.yaml` example\n",
    "2. How to prepare the config file `input.yaml`\n",
    "3. How to configure the paths for inputs, outputs, and intermediate results\n",
    "4. How to set the internal parameters of **Auto3DSeg** components\n",
    "5. How to use a third-party hyperparameter optimization(HPO) package with `AutoRunner`"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup environment\n",
    "\n",
    "The **Auto3DSeg** modules support multiple GPU since MONAI 1.2.\n",
    "The GPU devices are selected via setting the environment variable `CUDA_VISIBLE_DEVICES`.\n",
    "For example, if the users need to select GPU 0 to run this notebook, please add the `%env CUDA_VISIBLE_DEVICES=0` in the next cell"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!python -c \"import monai\" || pip install -q \"monai-weekly[nibabel, nni, tqdm, cucim, yaml, optuna]\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import tempfile\n",
    "\n",
    "from monai.bundle.config_parser import ConfigParser\n",
    "from monai.apps import download_and_extract\n",
    "\n",
    "from monai.apps.auto3dseg import AutoRunner\n",
    "from monai.config import print_config\n",
    "\n",
    "print_config()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Download dataset\n",
    "\n",
    "We provide a toy datalist file that splits a subset of the downloaded datasets into five folds."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n",
    "if directory is not None:\n",
    "    os.makedirs(directory, exist_ok=True)\n",
    "root_dir = tempfile.mkdtemp() if directory is None else directory\n",
    "print(root_dir)\n",
    "\n",
    "msd_task = \"Task04_Hippocampus\"\n",
    "resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/\" + msd_task + \".tar\"\n",
    "\n",
    "compressed_file = os.path.join(root_dir, msd_task + \".tar\")\n",
    "dataroot = os.path.join(root_dir, msd_task)\n",
    "if not os.path.exists(dataroot):\n",
    "    download_and_extract(resource, compressed_file, root_dir)\n",
    "\n",
    "datalist_file = os.path.join(\"..\", \"tasks\", \"msd\", msd_task, \"msd_\" + msd_task.lower() + \"_folds.json\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare a input YAML configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "input_cfg = {\n",
    "    \"name\": msd_task,  # optional, it is only for your own record\n",
    "    \"task\": \"segmentation\",  # optional, it is only for your own record\n",
    "    \"modality\": \"MRI\",  # required\n",
    "    \"datalist\": datalist_file,  # required\n",
    "    \"dataroot\": dataroot,  # required\n",
    "}\n",
    "input = \"./input.yaml\"\n",
    "ConfigParser.export_config_file(input_cfg, input)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run the Auto3DSeg pipeline in a few lines of code\n",
    "\n",
    "Below is the typical usage of AutoRunner\n",
    "```python\n",
    "runner = AutoRunner(input=input)\n",
    "runner.run()\n",
    "```\n",
    "\n",
    "The `run` command will take a long time since it will train algorithms over iterations.\n",
    "\n",
    "If the user would like to perform a full training in the tutorial, it is recommended to uncomment the `runner.run()` appended at the end of each code block.\n",
    "\n",
    "## Use the default setting with the input YAML file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Use the default setting with the dictionary instead of the YAML file as the input"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input_cfg)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize working directory\n",
    "`AutoRunner` provides the user interfaces to save all the intermediate and final results in a user-specified location.\n",
    "Here we use `./my_workspace` as an example"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(work_dir=\"./my_workspace\", input=input)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize result caching\n",
    "\n",
    "AutoRunner saves intermediate results by default to save computation time.\n",
    "The user can choose whether it uses the cached results or restart from scratch.\n",
    "\n",
    "If the users want to start from scratch, they can set `not_use_cache` to True"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# This will restart from scratch and not use any cached results\n",
    "runner = AutoRunner(input=input, not_use_cache=True)\n",
    "# runner.run()\n",
    "\n",
    "# Below will skip data analysis.\n",
    "# Because data analysis was NOT completed and cache before, AutoRunner will throw an error\n",
    "\n",
    "# runner = AutoRunner(input=input, analyze=False)  # This will throw error"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize the output folder to save ensemble result\n",
    "\n",
    "AutoRunner will perform inference on the testing data specified by the `datalist` in the data source config input. The inference result will be written to the `ensemble_output` folder under the working directory in the form of `nii.gz`. The user can choose the format by adding keyword arguments to the AutoRunner. A list of argument can be found in [MONAI tranforms documentation](https://docs.monai.io/en/stable/transforms.html#saveimage)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input, output_dir=\"./output_dir\")\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setting Auto3DSeg internal parameters\n",
    "`Auto3DSeg` has four steps: data analysis, algorithm generation, training, and ensemble. Users can configure the internal parameters of the `AutoRunner` object to customize some steps in the pipeline.\n",
    "\n",
    "Below, we begin the experiments with a smaller number of cross-validation folds. The default is 5 in the algorithm but we set it to 2 here:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input)\n",
    "runner.set_num_fold(num_fold=2)\n",
    "# runner.run()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize training parameters by override the default values\n",
    "\n",
    "`set_training_params` in `AutoRunner` provides an interface to change all algorithms' training parameters in one line. \n",
    "\n",
    "As an example, see the code block below, which specifies e.g. the number of epochs used for training. Note that some algorithms may treat this as a maximum number of epochs.\n",
    "\n",
    "NOTE: \n",
    "The setup works fine for a machine that has GPUs less than or equal to 8.\n",
    "The datalist in this example is only using a subset of the original dataset.\n",
    "Users need to ensure the number of GPUs is not greater than the number that the training dataset can be partitioned.\n",
    "For example, the following code block is not suitable for a 16-GPU system.\n",
    "In such cases, please change the code block accordingly.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "max_epochs = 2\n",
    "\n",
    "train_param = {\n",
    "    \"num_epochs_per_validation\": 1,\n",
    "    \"num_images_per_batch\": 2,\n",
    "    \"num_epochs\": max_epochs,\n",
    "    \"num_warmup_epochs\": 1,\n",
    "}\n",
    "\n",
    "runner = AutoRunner(input=input)\n",
    "runner.set_training_params(params=train_param)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize the ensemble method\n",
    "\n",
    "There are two supported methods: \"AlgoEnsembleBestN\" and \"AlgoEnsembleBestByFold\"\n",
    "\n",
    "> NOTE: if the users need to change the ensemble method to \"AlgoEnsembleBestByFold\" and number of folds in one experiment, please call the function `set_num_fold` prior to `set_ensemble_method` to ensure the number of folds is set correctly in the ensemble module."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input)\n",
    "runner.set_ensemble_method(ensemble_method_name=\"AlgoEnsembleBestByFold\")\n",
    "# runner.run()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customize the inference parameters by override the default values\n",
    "\n",
    "Check the [MONAI Auto3DSeg API Documentation](https://docs.monai.io/en/stable/apps.html#module-monai.apps.auto3dseg) for more information.\n",
    "\n",
    "NOTE: **Auto3DSeg** ensemble uses multiple GPUs by default.\n",
    "Please avoid allocating more GPUs than the number of files, e.g. use 2 GPUs but there is 1 file in `infer_files` or `file_slices`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# set model ensemble method\n",
    "pred_params = {\n",
    "    \"mode\": \"vote\",  # use majority vote instead of mean to ensemble the predictions\n",
    "    \"sigmoid\": True,  # when to use sigmoid to binarize the prediction and output the label\n",
    "}\n",
    "runner = AutoRunner(input=input)\n",
    "runner.set_prediction_params(params=pred_params)\n",
    "# runner.run()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train model with HPO\n",
    "\n",
    "**Auto3DSeg** supports hyper parameter optimization (HPO) via `NNI` and `Optuna` backends.\n",
    "If you would like to the use `Optuna`, please check the [notebook](hpo_optuna.ipynb) for detailed usage.\n",
    "\n",
    "Here we demonstrate the HPO option with `NNI` by Microsoft.\n",
    "Please install it via `pip install nni` if you hope to execute HPO with it in tutorial and haven't done so in the beginning of the notebook.\n",
    "AutoRunner supports `NNI` backend with a grid search method via automatically generating a the `NNI` config and run `nnictl` commands in subprocess.\n",
    "\n",
    "## Use `AutoRunner` with `NNI` backend to perform grid search\n",
    "\n",
    "After `runner.run()` is executed, `nni` will attempt to start a web service using port 8088 by default. If you are running the tutorial in a remote host, please ensure the port is available on the system.\n",
    "\n",
    "> NOTE: it is recommended to turn off ensemble if the users are using HPO features.\n",
    "> By default, all the models are saved under the working directory, including the ones tuned by the HPO package.\n",
    "> Users may want to read the HPO results before taking the next step.\n",
    "> If the users want to ensemble all the models, the `ensemble` option can be set to True."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input, hpo=True, ensemble=False)\n",
    "search_space = {\"learning_rate\": {\"_type\": \"choice\", \"_value\": [0.0001, 0.001, 0.01, 0.1]}}\n",
    "runner.set_nni_search_space(search_space)\n",
    "# runner.run()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Override the templated values\n",
    "\n",
    "The default `NNI` config that `AutoRunner` looks like below. User can override some of the parameters via the `set_hpo_params` interface:\n",
    "\n",
    "```python\n",
    "import torch\n",
    "default_nni_config = {\n",
    "    \"trialCodeDirectory\": \".\",\n",
    "    \"trialGpuNumber\": torch.cuda.device_count(),\n",
    "    \"trialConcurrency\": 1,\n",
    "    \"maxTrialNumber\": 10,\n",
    "    \"maxExperimentDuration\": \"1h\",\n",
    "    \"tuner\": {\"name\": \"GridSearch\"},\n",
    "    \"trainingService\": {\"platform\": \"local\", \"useActiveGpu\": True},\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input, hpo=True, ensemble=False)\n",
    "hpo_params = {\"maxTrialNumber\": 20}\n",
    "search_space = {\"learning_rate\": {\"_type\": \"choice\", \"_value\": [0.0001, 0.001, 0.01, 0.1]}}\n",
    "runner.set_hpo_params(params=hpo_params)\n",
    "runner.set_nni_search_space(search_space)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Quick demo - how to customize the HPO in `AutoRunner`\n",
    "\n",
    "Here we provide an example for the users to customize and override the HPO in `AutoRunner`.\n",
    "\n",
    "After the `runner.run()`, users may check the \"Trials detail\" in the NNI web server to see the progress of trainings.\n",
    "\n",
    "> NOTE: Users may refer to the [HPO documentation](../docs/hpo.md) for the meaning of the keys in the overriding parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "runner = AutoRunner(input=input, algos=(\"segresnet\"), hpo=True, ensemble=False)\n",
    "num_epoch = 2\n",
    "hpo_params = {\n",
    "    \"maxTrialNumber\": 20,\n",
    "    \"maxExperimentDuration\": \"30m\",\n",
    "    \"num_epochs_per_validation\": 1,\n",
    "    \"num_images_per_batch\": 1,\n",
    "    \"num_epochs\": 2,\n",
    "    \"num_warmup_epochs\": 1,\n",
    "    \"training#num_epochs\": 2,\n",
    "    \"training#num_epochs_per_validation\": 1,\n",
    "    \"searching#num_epochs\": 2,\n",
    "    \"searching#num_epochs_per_validation\": 1,\n",
    "    \"searching#num_warmup_epochs\": 1,\n",
    "    \"training#auto_scale_allowed\": False,  # disable the auto scale so that num_epochs can be set\n",
    "    \"auto_scale_allowed\": False,  # disable the auto scale so that num_epochs can be set\n",
    "}\n",
    "search_space = {\"learning_rate\": {\"_type\": \"choice\", \"_value\": [0.0001, 0.01]}}\n",
    "runner.set_num_fold(num_fold=1)\n",
    "runner.set_hpo_params(params=hpo_params)\n",
    "runner.set_nni_search_space(search_space)\n",
    "# runner.run()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more details about the usage of **Auto3DSeg** HPO features, please check the [Auto3DSeg NNI Notebok](./hpo_nni.ipynb) and [Auto3DSeg Optuna Notebook](./hpo_optuna.ipynb)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "Here we demonstrate how to use the AutoRunner APIs to customize your **Auto3DSeg** pipeline with mininal inputs. Don't forget you need to execute the `run` command to start the training and make everything take effect.\n",
    "\n",
    "```python\n",
    "runner.run()\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.13"
  },
  "vscode": {
   "interpreter": {
    "hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
